31 research outputs found

    New Algorithms for Position Heaps

    Full text link
    We present several results about position heaps, a relatively new alternative to suffix trees and suffix arrays. First, we show that, if we limit the maximum length of patterns to be sought, then we can also limit the height of the heap and reduce the worst-case cost of insertions and deletions. Second, we show how to build a position heap in linear time independent of the size of the alphabet. Third, we show how to augment a position heap such that it supports access to the corresponding suffix array, and vice versa. Fourth, we introduce a variant of a position heap that can be simulated efficiently by a compressed suffix array with a linear number of extra bits

    Standardized next-generation sequencing of immunoglobulin and T-cell receptor gene recombinations for MRD marker identification in acute lymphoblastic leukaemia; a EuroClonality-NGS validation study

    Get PDF
    Amplicon-based next-generation sequencing (NGS) of immunoglobulin (IG) and T-cell receptor (TR) gene rearrangements for clonality assessment, marker identification and quantification of minimal residual disease (MRD) in lymphoid neoplasms has been the focus of intense research, development and application. However, standardization and validation in a scientifically controlled multicentre setting is still lacking. Therefore, IG/TR assay development and design, including bioinformatics, was performed within the EuroClonality-NGS working group and validated for MRD marker identification in acute lymphoblastic leukaemia (ALL). Five EuroMRD ALL reference laboratories performed IG/TR NGS in 50 diagnostic ALL samples, and compared results with those generated through routine IG/TR Sanger sequencing. A central polytarget quality control (cPT-QC) was used to monitor primer performance, and a central in-tube quality control (cIT-QC) wa

    On the Number of Elements to Reorder When Updating a Suffix Array

    Get PDF
    Recently new algorithms appeared for updating the Burrows-Wheeler transform or the suffix array, when the text they index is modified. These algorithms proceed by reordering entries and the number of such reordered entries may be as high as the length of the text. However, in practice, these algorithms are faster for updating the Burrows-Wheeler transform or the suffix array than the fastest reconstruction algorithms. In this article we focus on the number of elements to be reordered for real-life texts. We show that this number is related to LCP values and that, on average, Lave entries are reordered, where Lave denotes the average LCP value, defined as the average length of the longest common prefix between two consecutive sorted suffixes. Since we know little about the LCP distribution for real-life texts, we conduct experiments on a corpus that consists of DNA sequences and natural language texts. The results show that apart from texts containing large repetitions, the average LCP value is close to the one expected on a random text

    efficient dynamic suffix array

    No full text
    International audienc

    Average Complexity for Updating a Suffix Array

    No full text
    International audienc

    Average Complexity for Updating a Suffix Array

    No full text
    International audienc

    Dynamic Burrows-Wheeler Transform

    No full text
    International audienc
    corecore